Discrete Hierarchical Organization 
of Social Group Sizes 

Wei-Xing Zhou,1 Didier Sornette,1,2,3* Russell A. Hill,4 Robin I.M. Dunbar5*

1 Institute of Geophysics and Planetary Physics,
University of California, Los Angeles, CA 90095, USA
2 Department of Earth and Space Sciences,
University of California, Los Angeles, CA 90095, USA
3 Laboratoire de Physique de la Matière Condensée,
CNRS UMR 6622 and Université de Nice-Sophia Antipolis,
06108 Nice Cedex 2, France
4 Evolutionary Anthropology Research Group,
Department of Anthropology, University of Durham, 43 Old Elvet,
Durham DH1 3HN, UK
5 British Academy Centenary Project, School of Biological Sciences,
University of Liverpool, Crown St., Liverpool L69 7ZB, England

*To whom correspondence should be addressed. 
E-mail: sornette@moho.ess.ucla.edu (D.S.) and rimd@liverpool.ac.uk (R.I.M.D.). 

The "social brain hypothesis" for the evolution of large brains in primates has 
led to evidence for the coevolution of neocortical size and social group sizes. 
Extrapolation of these findings to modern humans indicated that the equiv- 
alent group size for our species should be approximately 150 (essentially the 
number of people known personally as individuals). Here, we combine data on 
human grouping in a comprehensive and systematic study. Using fractal anal- 
ysis, we identify with high statistical confidence a discrete hierarchy of group 
sizes with a preferred scaling ratio close to 3: rather than a single or a continuous
spectrum of group sizes, humans spontaneously form groups of preferred
sizes organized in a geometrical series approximating 3, 9, 27, ... Such discrete 
scale invariance (DSI) could be related to that identified in signatures of herd- 
ing behavior in financial markets and might reflect a hierarchical processing 
of social nearness by human brains. 

Attempts to understand the grouping patterns of humans have a long history in both sociology
(1) and social anthropology (2, 3). However, these approaches have been largely ecological
in focus. In contrast, recent attempts to understand the evolution of sociality in primates have
focussed in part on the cognitive constraints that may limit the ecological flexibility of group
size (4, 5). The social brain hypothesis, as it has come to be known, argues that the evolution of
primate brains (and in particular, the neocortex) was driven by the need to coordinate and manage
increasingly large social groups. Since the stability of these groupings is based on intimate
knowledge of other individuals and the ability to use this knowledge to manage social
relationships effectively, the volume of neural matter available for cognitive processing inevitably
imposes a species-specific limit on group size. Attempts to increase group size beyond this
threshold result in reduced social stability and, ultimately, group fission. Extrapolating these
findings to humans led to the prediction that humans had a cognitive limit of about 150 on
the number of individuals with whom coherent personal relationships could be maintained (6).
Evidence to support this prediction has come from a number of ethnographic and sociological
sources. It has, however, always been recognised that both human and nonhuman primate
groups are internally highly structured. Further analyses (7) have indicated that at least one level
of structuring (the grooming clique) also correlates with neocortex size. While it is not always
clear what the significance of these tiered groupings is, it is clear that human social groups (like
those of other primates) consist of a series of hierarchically organised sub-groupings. We first
review previous quantifications of group sizes and then provide a systematic analysis. There
is no universally accepted procedure, and all methods attempting to identify group sizes suffer
from several sources of bias (small sample size, large inter-individual variability, and the
criteria used to include individuals). Our strategy is to include all the reasonable data and attempt to
extract useful signals above the noise level by a careful analysis of the global data set.

The core social grouping is called the support clique, defined as the set of individuals from
whom the respondent would seek personal advice or help in times of severe emotional and financial
distress, whose mean size is typically 3-5 individuals (8, 9). Above this may be discerned
a grouping of 12-20 individuals (often referred to as a sympathy group) that characteristically
consists of all the individuals with whom one has special ties; these individuals are typically
contacted at least once a month (8, 9). The ethnographic data on hunter-gatherer societies (6)
point to a grouping of 30-50 individuals as the size of overnight camps (sometimes referred to
as bands). These groupings are often unstable, but their membership is always drawn from the
same set of individuals, who typically number in the order of 150 individuals. This last grouping
is often identified in small scale traditional societies as the clan or regional group. Beyond
these, at least two larger scale groupings have been identified in the ethnographic literature: the
megaband of about 500 individuals and the tribe (a linguistic unit, commonly of 1000-2000
individuals) (6).

We complement the data used in Refs. (6, 8, 9) and sources therein with new data as follows.

The USA 1998 General Social Survey reports a mean size of 3.3 for the support clique (10). The
sizes of sympathy groups are reported to be 14.0 in Egypt, 15.1 in Malaysia, 13.5 in Mexico,
13.8 in South Africa (11), 10.2 in USA (12), 15 in The Netherlands (1995) (13, 14), 15.0 in
The Netherlands (1992), 14.3 in The Netherlands (1992-1993), 14.8 in The Netherlands
(1995-1996), 14.2 in The Netherlands (1998-1999) (15, 16), and 14.4 in Mali of West Africa (17). See
Figure 1.

Method 1: Average sizes of different network layers. To summarize the previously cited
data, we denote $S_1$ as the mean support clique size, $S_2$ the mean sympathy group size, $S_3$ the
mean band size, $S_4$ the mean cognitive group size, and $S_5$ and $S_6$ the sizes of small and large
tribes. Here, we do not address the relevance of this classification (which will be done below) but
only characterize it quantitatively. The previously cited data give $S_0 = 1$ (individual or ego),
$S_1 = 4.6$, $S_2 = 14.3$, $S_3 = 42.6$, $S_4 = 132.5$, $S_5 = 566.6$, and $S_6 = 1728$. In order
to determine the possible existence of a discrete hierarchy, we construct the series of ratios
$S_i/S_{i-1}$ of successive mean sizes:

$$ S_i/S_{i-1} = 4.58,\ 3.12,\ 2.98,\ 3.11,\ 4.28,\ 3.05, \quad \text{for } i = 1, \dots, 6. \qquad (1) $$

This result suggests that humans form groups according to a discrete hierarchy with a preferred
scaling ratio between 3 and 4: the mean of $S_i/S_{i-1}$ is about 3.5.
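
As a quick arithmetic check of the ratios in (1), the following Python sketch recomputes them from the mean sizes quoted above. Because those means are rounded, the recomputed ratios may differ from the printed ones in the last digit. The names `mean_sizes` and `successive_ratios` are illustrative and are not taken from the original analysis.

```python
# Mean group sizes quoted in the text, S_0 (ego) through S_6 (large tribe).
mean_sizes = [1.0, 4.6, 14.3, 42.6, 132.5, 566.6, 1728.0]


def successive_ratios(sizes):
    """Return S_i / S_{i-1} for i = 1, ..., len(sizes) - 1."""
    return [later / earlier for earlier, later in zip(sizes, sizes[1:])]


ratios = successive_ratios(mean_sizes)
mean_ratio = sum(ratios) / len(ratios)

for i, ratio in enumerate(ratios, start=1):
    print(f"S_{i}/S_{i - 1} = {ratio:.2f}")
print(f"arithmetic mean of the ratios = {mean_ratio:.2f}")
```

All six ratios fall between roughly 3 and 4.6, which is the observation that motivates the more systematic analysis of the full data set.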

Method 2: Probability density function and generalized q-analysis of the complete
data set. In order to avoid any potential biases in the published group classifications,
we employ a more systematic method of analysis that uses all the available data and not just the
mean group sizes.

The sample has 61 grouping clusters (including the ego) with sizes $s_i$ available for
$i = 1, 2, \dots, 61$. Figure 1 presents the data in a form attributing group sizes to their relevant
studies in an arbitrary order. We consider this sample to be a realization of a distribution whose
sample estimation can be written as

$$ f(s) = \frac{1}{61} \sum_{i=1}^{61} \delta(s - s_i), \qquad (2) $$

where $\delta$ is Dirac's delta function. Figure 2 shows the probability density function $f(s)$ obtained
by applying a Gaussian kernel estimation approach (18).
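
The kernel smoothing of (2) can be sketched in a few lines of Python. This is a minimal illustration, assuming a fixed-bandwidth Gaussian kernel applied in the variable ln s with the bandwidth h = 0.14 quoted in the caption of Figure 2. The list `example_sizes` is hypothetical and is not the 61-point data set of the paper, and the function name `gaussian_kde_log` is ours.

```python
import math


def gaussian_kde_log(sizes, h=0.14):
    """Return a function estimating the density of ln(s) with a Gaussian kernel."""
    log_sizes = [math.log(s) for s in sizes]
    norm = 1.0 / (len(log_sizes) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian bumps of width h centred on each ln(s_i).
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in log_sizes)

    return density


# Hypothetical sizes for illustration only; NOT the data set analysed in the paper.
example_sizes = [1, 3, 4, 5, 12, 15, 20, 35, 50, 150, 500, 1500]
density = gaussian_kde_log(example_sizes, h=0.14)

# Evaluate on a regular grid in ln(s) that extends beyond the data range.
step = 0.05
grid = [-1.0 + step * i for i in range(201)]  # ln(s) from -1 to 9
values = [density(x) for x in grid]
```

Each delta function in (2) is replaced by a Gaussian of width h, so the estimate integrates to one over ln s while remaining smooth enough to differentiate.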

Our challenge is to extract a possible periodicity in this function in the variable $\ln s$, if any.
For instance, if the ratios given in (1) are genuine, one would expect a periodic oscillation
of $f(s)$ expressed in the variable $\ln s$ with mean period $\ln(3.5) \approx 1.25$. This is called
"log-periodicity" (19).

Standard spectral analysis applied to $f(s)$ is dominated by the trend seen in Figure 2, giving
a peak at a very low log-frequency corresponding to the whole range of the group sizes. We
thus turn to generalized q-analysis, or (H, q)-analysis (20), which has been shown to be very
sensitive and efficient for such tasks. The q-analysis is a natural tool to describe DSI in fractals
and multifractals (21, 22). The (H, q)-analysis consists in constructing the (H, q)-derivative

$$ D_q^H f(s) \triangleq \frac{f(s) - f(qs)}{[(1-q)s]^H}. \qquad (3) $$

Introducing an exponent H different from 1 allows us to detrend $f(s)$ in an adaptive way. Note
that the limit $H = 1$ and $q \to 1$ retrieves the standard definition of the derivative of $f$. A value
of q strictly less than 1 allows one to enhance possible discrete scale structures in the data. To
keep a good resolution, we work with 0.65 < q < 0.95, because smaller q's require more data
for small s's. To put more weight on the small group sizes (which are probably more reliable
since they are obtained by conducting general surveys in larger representative populations), we
use 0.5 < H < 0.9. A typical (H, q)-derivative with H = 0.5 and q = 0.8 is illustrated in a
semi-log plot in Figure 3.
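
The (H, q)-derivative of (3) translates directly into code. The sketch below is an illustration under the definition given in (3) only; the function name `hq_derivative` and the test function `square` are ours. It also checks numerically the limit mentioned in the text: with H = 1 and q close to 1, the result approaches the ordinary derivative.

```python
def hq_derivative(f, s, H, q):
    """(H, q)-derivative of Eq. (3): [f(s) - f(q s)] / [(1 - q) s]**H."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return (f(s) - f(q * s)) / ((1.0 - q) * s) ** H


def square(s):
    """Smooth test function with known derivative 2 s."""
    return s * s


# With H = 1 and q -> 1 the (H, q)-derivative tends to d(s**2)/ds = 2 s.
approx = hq_derivative(square, s=3.0, H=1.0, q=0.999)

# The parameter pair highlighted in the text, applied to the same test function.
typical = hq_derivative(square, s=3.0, H=0.5, q=0.8)
```

In the analysis itself, `f` would be the kernel estimate of the density rather than a closed-form function.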

We then use a Lomb periodogram analysis (23) to extract the log-periodicity in $f(s)$. Figure
4 presents the normalized Lomb periodograms of $D_q^H f(s)$ for different pairs of (H, q) with
0.5 < H < 0.9 and 0.65 < q < 0.95. This figure illustrates the robustness of our result. For the
specific values H = 0.5 and q = 0.8 shown in Figure 3, the highest peak is at $\omega_1 = 5.40$ with
height $P_N = 8.67$. The preferred scaling ratio is thus $\lambda = e^{2\pi/\omega_1} = 3.20$, which is consistent
with the previous result using the "grouping analysis" (1). The confidence level is 0.993 under
the null hypothesis of white noise (23). If the underlying noise decorating the log-periodic
structure is correlated with a Hurst index of 0.6, the confidence level decreases to 0.99; if the
Hurst index is 0.7, the confidence level falls to 0.85 (24).
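
To make the link between a Lomb peak and a scaling ratio concrete, here is a minimal pure-Python sketch of the normalized Lomb periodogram in its standard textbook form, together with the conversion λ = exp(2π/ω). It is not the authors' code. The signal is synthetic (a cosine in ln s built with λ = 3), and the names `lomb_periodogram` and `scaling_ratio` are ours.

```python
import math


def lomb_periodogram(t, y, omegas):
    """Normalized Lomb periodogram P_N(omega) for unevenly sampled (t, y)."""
    n = len(t)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / (n - 1)
    powers = []
    for w in omegas:
        # The offset tau makes the result invariant under shifts of t.
        s2 = sum(math.sin(2.0 * w * ti) for ti in t)
        c2 = sum(math.cos(2.0 * w * ti) for ti in t)
        tau = math.atan2(s2, c2) / (2.0 * w)
        cos_terms = [math.cos(w * (ti - tau)) for ti in t]
        sin_terms = [math.sin(w * (ti - tau)) for ti in t]
        yc = sum((v - mean) * c for v, c in zip(y, cos_terms))
        ys = sum((v - mean) * s for v, s in zip(y, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        powers.append((yc * yc / cc + ys * ys / ss) / (2.0 * var))
    return powers


def scaling_ratio(omega):
    """Preferred scaling ratio lambda = exp(2 pi / omega)."""
    return math.exp(2.0 * math.pi / omega)


# Synthetic log-periodic signal with lambda = 3, sampled unevenly in t = ln(s).
true_omega = 2.0 * math.pi / math.log(3.0)
t = [0.13 * k + 0.05 * math.sin(k) for k in range(60)]
y = [math.cos(true_omega * ti) for ti in t]

omegas = [1.0 + 0.02 * k for k in range(600)]
powers = lomb_periodogram(t, y, omegas)
peak_omega = omegas[powers.index(max(powers))]
print(f"peak at omega = {peak_omega:.2f}, lambda = {scaling_ratio(peak_omega):.2f}")
```

Applying `scaling_ratio` to the peak position 5.40 reported in the text reproduces the quoted ratio of about 3.20.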

The Lomb periodograms also exhibit a second peak at $\omega_2 = 9.80$ with height $P_N = 5.48$.
This can be interpreted as the second harmonic component $\omega_2 \approx 2\omega_1$ of the fundamental
component at $\omega_1 = 5.40$. The amplitude ratio of the fundamental and the harmonic is 1.26. The
co-existence of the two peaks at $\omega_1$ and $\omega_2 \approx 2\omega_1$ strengthens the statistical significance
of a log-periodic structure. To see this, we constructed 10000 synthetic sets of 61 values uniformly
distributed in the variable $\ln s$ within the interval [0, ln(2000)]. By construction, these 10000
sets, which are exactly of the same size as our data and span the same interval, do not have
log-periodicity and thus have no characteristic sizes. We then applied the same procedure as
for the real data set to these synthetic data sets and obtained 10000 Lomb periodograms. We
then performed the following test on each Lomb periodogram: find the highest Lomb peak
$(\omega, P_N)$; if $P_N > 8.5$, check whether there is at least one other peak at $2\omega \pm 1$ with $P_N$
larger than 5.5. Of the 10000 sets, 238 passed the test, suggesting a probability of 0.024 that our
signal results from chance. The probability that there are at least two peaks (one in
$4.9 < \omega < 5.9$ with $P_N > 8.5$ and the other in $9.5 < \omega < 11.5$ with $P_N > 5.5$) is found equal to
77/10000, giving another estimation of 0.993 for the statistical confidence of our results. Another
metric consists in quantifying the area below the significant peaks found in the Lomb periodogram
of our data and comparing it with those in the synthetic sets. We count the area of the main
peak of the Lomb periodogram at $\omega$ and add to it the areas of its harmonics whose local maxima
fall in the intervals $[(k - 1/5)\omega, (k + 1/5)\omega]$ for $k = 2, 3, \dots$. The
area associated with a peak is defined as the region around a local maximum delimited by the
two closest local minima bracketing it. The fraction of synthetic sets which give an area thus
defined larger than the value found for the real data is 6-7%, depending on the specific values
of H and q used in the analysis. Summarising, all these tests suggest that the evidence in support
of our hypothesis is unlikely to result from chance, but rather reflects the fact
that human group sizes are naturally structured into a discrete hierarchy.
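
The first surrogate test described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: `periodogram_of` is a hypothetical callable standing in for the full pipeline (density estimate, (H, q)-derivative and Lomb periodogram on the grid `omegas`), and the peak detection by simple local maxima is our assumption. The thresholds 8.5 and 5.5, the window of width 1 around twice the main frequency, the sample size 61 and the range [0, ln(2000)] are taken from the text.

```python
import math
import random


def synthetic_log_sizes(n=61, s_max=2000.0, rng=random):
    """Draw n values uniformly in ln(s) on [0, ln(s_max)] (the null model)."""
    return [rng.uniform(0.0, math.log(s_max)) for _ in range(n)]


def local_maxima(omegas, powers):
    """Return (omega, power) pairs at interior local maxima of a periodogram."""
    return [
        (omegas[i], powers[i])
        for i in range(1, len(powers) - 1)
        if powers[i - 1] < powers[i] >= powers[i + 1]
    ]


def passes_two_peak_test(omegas, powers, main_min=8.5, harmonic_min=5.5):
    """True if the highest peak exceeds main_min and another peak within
    2*omega +/- 1 exceeds harmonic_min."""
    peaks = local_maxima(omegas, powers)
    if not peaks:
        return False
    omega1, p1 = max(peaks, key=lambda peak: peak[1])
    if p1 <= main_min:
        return False
    return any(
        abs(w - 2.0 * omega1) <= 1.0 and p > harmonic_min
        for w, p in peaks
        if w != omega1
    )


def chance_probability(periodogram_of, omegas, n_sets=10000, seed=0):
    """Fraction of synthetic sets whose periodogram passes the two-peak test.

    periodogram_of is a hypothetical callable mapping a list of ln(s) values
    to a list of Lomb powers evaluated on omegas.
    """
    rng = random.Random(seed)
    hits = sum(
        passes_two_peak_test(omegas, periodogram_of(synthetic_log_sizes(rng=rng)))
        for _ in range(n_sets)
    )
    return hits / n_sets
```

With the counts reported in the text, 238 passing sets out of 10000 correspond to an estimated chance probability of 0.024.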

Method 3: Probability density function and generalised q-analysis of individual networks.
We apply the same analysis to individual social networks based upon the exchange of
Christmas cards in contemporary Western society (9). This study indicated that contemporary
social networks may be differentiated on the basis of frequency of contact between individuals,
but that both 'passive' and 'active' factors may determine contact frequency. Controlling
for the passive factors allowed the hierarchical network structure to be examined on the basis
of residual (active) contact frequency. Starting from the residual contact frequencies, we
constructed their (H, q)-derivative with respect to the number of people contacted for each
individual, obtained the Lomb spectrum of the (H, q)-derivative and then averaged them over the
42 individuals in the study (Figure 5). The very strong peak at $\omega = 5.2$ is consistent with the
previous results with a preferred scaling ratio from the expression $\lambda = e^{2\pi/\omega} \approx 3.3$ (19) for the
smaller grouping levels in this study (group sizes below 150).

Discussion. Putting together a variety of measures collected under a wide range of experimental
conditions and in different countries, we have documented a coherent set of characteristic
group sizes organized according to a geometric series with a preferred scaling ratio close to
3. Similar hierarchies can be found in other types of human organizations, of which the military
probably provides the best examples. In the land armies of many countries, one typically finds
sections (or squads) of about 10-12 soldiers, platoons (of 3 sections, ≈ 35), companies (3-4
platoons, ≈ 120-150), battalions (usually 3-4 companies plus support units, ≈ 550-800),
regiments (or brigades) (usually three battalions, plus support; 2500+), divisions (usually 3
regiments), and corps (2-3 divisions). This gives a series with a multiplying factor from one level
to the next close to three. Could it be that the army's structures have evolved so as to mimic the
natural hierarchical groupings of everyday social structures, thereby optimising the cognitive
processing of within-group interactions?

The existence of a discrete hierarchy of group sizes may provide a key ingredient in
rationalizing the reported existence of discrete scale invariance (DSI) in financial time series in
so-called "bubble" regimes characterized by strong herding behaviors between investors (25).
Johansen et al. (26, 27) have proposed a model to explain the observed DSI in stock market
prices as resulting from a discrete hierarchy in the interactions between investors. Recent
analyses of DSI in regimes with a strong herding component have also identified the presence of a
strong harmonic at $2\omega$, similar to the findings reported here (28, 29). The fact that DSI is found
only during stock market regimes associated with a strong herding behavior suggests that it may
reflect the fact that a discrete hierarchy of naturally occurring group sizes characterizes human
interactions whether they be hunter-gatherers or traders. The present work suggests that this
discrete hierarchy may have its origins in the fundamental organization of any social structure
and be deeply rooted within the cognitive processing abilities of human brains.

When dealing with discrete hierarchies, it may be important to distinguish between the 
specific group sizes on the one hand and their successive ratios on the other. It may be that the 
absolute values of the group sizes are less important than the ratios between successive group 
sizes. If the ratio of group sizes is interpreted as a fractal dimension (specifically, the ratio is 
related to the imaginary part of a fractal dimension: see (19) and references therein), this would
imply that, depending on the social context, the minimum "nucleation" size may vary, but the 
ratio (close to 3) might be universal. The fundamental question, then, is to determine the origin 
of this discrete hierarchy. At present, there is no obvious reason why a ratio of 3 should be 
important. Equally, however, we have little real understanding of what cognitive mechanisms 
might limit the nucleation point to a particular value. Considerable additional work will need 
to be done on both these components if we are to understand why these constraints on human 
grouping patterns exist and exactly what their significance might be. 






References and Notes 

1. J.S. Coleman, An Introduction to Mathematical Sociology (Collier-Macmillan, London, 
1964). 

2. C.P. Kottak, Cultural anthropology (McGraw Hill, New York, 1991).

3. R. Scupin, Cultural anthropology - a global perspective (Prentice Hall, Englewood Cliffs, 
1992). 

4. R.I.M. Dunbar, Evol. Anthropology 6, 178 (1998). 

5. R.I.M. Dunbar, J. Hum. Evol. 22, 469 (1992). 

6. R.I.M. Dunbar, Behav. Brain Sci. 16, 681 (1993). 

7. H. Kudo, R.I.M. Dunbar, Anim. Behav. 62, 111 (2001).

8. R.I.M. Dunbar, M. Spoors, Hum. Nature 6, 273 (1995). 

9. R.A. Hill, R.I.M. Dunbar, Hum. Nature 14, 53 (2003). 

10. P.V. Marsden, Social Networks 25, 1 (2003).

11. C.J. Buys, Psychol. Rep. 45, 789 (1992). 

12. C. Latkin et al., Social Networks 17, 219 (1995).

13. S. Kef, Journal of Visual Impairment and Blindness 91, 236 (1997). 

14. S. Kef, J.J. Hox, H.T. Habekothe, Social Networks 22, 73 (2000). 

15. T.G. van Tilburg, M.I.B. van Groenou, J. Soc. Issues 58, 697 (2002). 

16. M.I.B. van Groenou, T.G. van Tilburg, Ageing and Society 23, 625 (2003). 




17. A.M. Adams, S. Madhavan, D. Simon, Soc. Sci. Med. 54, 165 (2002). 

18. B.W. Silverman, Density Estimation for Statistics and Data Analysis (Chapman and Hall, 
London, 1986). 

19. D. Sornette, Phys. Rep. 297, 239 (1998). 

20. W.-X. Zhou, D. Sornette, Phys. Rev. E 66, 046111 (2002).

21. A. Erzan, Phys. Lett. A 225, 235 (1997). 

22. A. Erzan, J.P. Eckmann, Phys. Rev. Lett. 78, 3245 (1997).

23. W. Press, S. Teukolsky, W. Vetterling, B. Flannery, Numerical Recipes in FORTRAN: The 
Art of Scientific Computing (Cambridge University, Cambridge, 1996). 

24. W.-X. Zhou, D. Sornette, Int. J. Mod. Phys. C 13, 137 (2002). 

25. D. Sornette, Why Stock Markets Crash - Critical Events in Complex Financial Systems 
(Princeton University Press, Princeton, NJ, 2003). 

26. A. Johansen, D. Sornette, O. Ledoit, J. Risk 1, 5 (1999). 

27. A. Johansen, O. Ledoit, D. Sornette, Int. J. Theor. Appl. Finance 3, 219 (2000). 

28. A. Johansen, D. Sornette, Int. J. Mod. Phys. C 10, 563 (1999). 

29. D. Sornette, W.-X. Zhou, Quant. Finance 2, 468 (2002). 

30. Research by WXZ and DS was partially supported by the James S. McDonnell Foundation
21st century scientist award/studying complex system. Research by RH and RD was funded
by the ESRC's Research Centre in Economic Learning and Social Evolution (ELSE). RD's
research is supported by the British Academy Centenary Project and by a British Academy
Research Professorship.



[Figure 1 (plot not reproduced); abscissa: Network sizes]
Figure 1: Presentation of our data set of 61 group sizes. The ordinate is an arbitrary ordering 
of data sources and the abscissa gives the group sizes reported in each source. The symbols
refer to the classification used in each of the studies: circle (support clique), triangle (sympathy 
group), diamond (bands), stars (cognitive groups), squares (small and large tribes). This clas- 
sification is not used in our systematic analysis summarized in the other figures, to avoid any 
bias. 




Figure 2: Probability density function $f(s)$ of size s estimated with a Gaussian kernel estimator
in the variable $\ln s$ with a bandwidth h = 0.14. Varying h by 100% does not change $f(s)$ much and
gives similar results.

Figure 3: Typical (H, q)-derivative $D_q^H f(s)$ of the probability density $f(s)$ as a function of size
s with H = 0.5 and q = 0.8.

Figure 4: Normalized Lomb periodograms $P_N(\omega)$ as a function of angular log-frequency $\omega$ of
the (H, q)-derivative $D_q^H f(s)$ for different pairs of (H, q) with 0.5 < H < 0.9 and 0.65 < q <
0.95. The red line gives the averaged Lomb power.

Figure 5: Average Lomb periodogram $P_N(\omega)$ of the (H, q)-derivative $D_q^H f(s)$ with respect to
the number of receivers of the residual contact frequency for each individual in the Christmas
card experiment, as a function of the angular log-frequency $\omega$ of the (H, q)-derivative, over the
42 individuals and different pairs of (H, q) with -1 < H < 1 and 0.80 < q < 0.95.